An Extensible Crosslinguistic Readability Framework

Authors

  • Jesse Kirchner
  • Justin Nuger
  • Yi Zhang

Abstract

Automatic assessment of the readability level (i.e., the relative linguistic complexity) of documents in a large number of languages is an important problem with many real-world applications, such as retrieving age-appropriate search engine results for children and constructing automatic tutoring systems. Unfortunately, existing readability labeling techniques have been applied to only a very small number of languages. In this paper, we present an extensible crosslinguistic readability framework that uses parallel corpora to quickly create readability software for thousands of languages, including languages for which no linguists are available to define readability rules and languages that lack readability-labeled documents for training readability models. To demonstrate the idea, we developed a system based on the proposed framework. This paper discusses the theoretical and practical issues involved in designing such a system and presents the results of an experiment conducted with it.

Similar resources

DeXteR - An Extensible Framework for Declarative Parameter Passing in Distributed Object Systems

In modern distributed object systems, reference parameters are passed to a remote method based on their runtime type. We argue that such type-based parameter passing is limiting with respect to expressiveness, readability, and maintainability, and that parameter passing semantics should be decoupled from parameter types. We present declarative parameter passing, an approach that fully decouples...

Testing Extensible Language Debuggers

Extensible languages allow incremental extensions of a host language with domain specific abstractions. Debuggers for such languages must be extensible as well to support debugging of different language extensions at their corresponding abstraction level. As such languages evolve over time, it is essential to constantly verify their debugging behavior. For this purpose, a General Purpose Langua...

A Coordination Module for a Crosslinguistic Grammar Resource

The Grammar Matrix is a resource for linguists writing grammars of natural languages; however, up to this point it has not included support for coordination. In this paper, we survey the typological range of coordination phenomena in the world’s languages, then detail the support, both syntactic and semantic, for those phenomena in the Grammar Matrix. Furthermore, we describe the concept of a M...

EFL Textbook Evaluation: An Analysis of Readability and Vocabulary Profiler of Four Corners Book Series

This study aimed to investigate whether there is any significant relationship between the readability and vocabulary profile including the most frequent words (K1 words) and academic word list (AWL) of reading passages of Four Corners series which were EFL textbooks. To determine the readability of the texts, the Flesch–Kincaid (1975) readability test was used, while the texts' academic word li...

Publication date: 2009